Abstract

In this work we apply tools developed for the study of fractal properties of time 
series to the problem of classifying defects in welding joints probed by ultrasonic 
techniques. We employ the fractal tools in a preprocessing step, producing curves with
a considerably smaller number of points than in the original signals. These curves 
are then used in the classification step, which is realized by applying an extension 
of the Karhunen-Loeve linear transformation. We show that our approach leads to 
small error rates, comparable with those obtained by using more time-consuming 
methods based on non-linear classifiers. 



1 Introduction 



Ultrasonic tests can serve as a useful tool for evaluating the integrity of metallic 
structures, and especially of weld joints. By inspecting the scattering pattern
of ultrasonic waves propagating in the material, it is possible to identify the 
presence of defects, and to estimate their dimensions. However, it is often 
desirable to have precise information about the nature of the defects, and a 
number of studies have tried to propose useful approaches to perform such 
classification [1,2,3,4], mostly based on direct analysis of the patterns with
neural networks.



In the present paper, we describe a distinct approach, based on tools developed
for analyzing fractal properties of time series [5,6,7,8]. This kind of approach
has been successfully applied to ultrasonic signals (interpreted as a particular
kind of time series) by a number of authors [9,10,11], both in defect and in
microstructure classification, by calculating the exponents of the power laws
characterizing various fractal features of the series, in the hope of associating
different sets of exponents with different classes. However, this can only be
expected to work when the typical series corresponding to different classes
are highly dissimilar, because each series is usually short, and estimates
of the various exponents are therefore subject to significant fluctuations.
Thus, in general, an expanded set of features must be used to obtain
an efficient classification algorithm. Here we employ tools from the statistical
pattern-classification literature to extract relevant features from the set of
fractal analyses applied to ultrasonic signals obtained from weld joints having
three different kinds of defects.

We used 240 ultrasonic signals obtained by the TOFD technique [12], with 60
signals corresponding to each kind of defect (lack of fusion, lack of penetration,
and porosity) and another 60 signals from regions with no defects. (For
a description of the materials used, as well as the techniques for producing and
capturing the signals, see Ref. [3].) All signals had a length of 512 points,
with 8-bit resolution. Typical signals are shown in Fig. 1. After normalizing
all signals so that the maximum and minimum values correspond to 1 and
-1, we calculated the corresponding curves from four different techniques of
fractal analysis, which we describe in Sec. 2. Then, as described in Sec. 3,
we employed a variation of the Karhunen-Loeve (KL) linear transformation
[13,14] to extract relevant features from the curves. As we discuss in the final
section, the combined approach of fractal analysis and KL transformation
yields low error rates for the defects studied.



2 Fractal analysis 

All techniques of fractal analysis employed here start by dividing the signal
into intervals containing $\tau$ points. Each technique then involves the calculation
of the average of some quantity $Q(\tau)$ over all intervals, for different values of
$\tau$. In a signal with genuine fractal features, $Q(\tau)$ should scale as a power of $\tau$,

$$ Q(\tau) \sim \tau^{\eta}, \tag{1} $$

at least in an intermediate range of values of $\tau$, corresponding to $1 \ll \tau \ll L$,
$L$ being the signal length.
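
As a rough illustration of how an exponent such as the one in Eq. (1) can be estimated, the sketch below fits a straight line to the curve in log-log coordinates. It is only a minimal sketch under stated assumptions: the function name `scaling_exponent` and the synthetic power-law data are introduced here for illustration and are not part of the procedure used in this work.

```python
import numpy as np

def scaling_exponent(taus, Q):
    """Least-squares slope of log Q versus log tau (hypothetical helper)."""
    slope, _intercept = np.polyfit(np.log(taus), np.log(Q), 1)
    return float(slope)

# Synthetic example: an exact power law with exponent 0.7.
taus = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
Q = 3.0 * taus ** 0.7
eta = scaling_exponent(taus, Q)
```

For real signals the fit would have to be restricted to the intermediate range of interval sizes in which an approximately linear log-log regime is actually observed.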

Fractals of different nature should give rise to different exponents $\eta$, providing
a signature of the fractal. In our case, due to the finite number of points,
and to the very nature of the signals, a pure power-law behavior is hard to
observe. Instead, as shown in Fig. 2, the curves usually exhibit features such
as a crossover between different power-law behaviors, or saturation points,
which can also serve as signatures of the different kinds of defects. However,
identifying the relevant features in advance is a complex task. Fortunately, the
pattern-classification literature offers useful tools for feature extraction from
data, and we describe one of those in Sec. 3 and Appendix A.

Figure 1. Typical examples of signals obtained from samples with (a) lack-of-fusion
defects, (b) lack-of-penetration defects, (c) porosities, and (d) no defects. The
horizontal axes correspond to the time direction, in units of the inverse sample rate of
the equipment.



2.1 Hurst (R/S) analysis 

The rescaled-range (R/S) analysis was introduced by Hurst [5] as a tool for 
evaluating the persistency or antipersistency of a time series. The method 
works by dividing the series into intervals of a given size, and calculating the 
average ratio of the range (the difference between the maximum and minimum 
values of the series) to the standard deviation inside each interval. The size of 
each interval is then varied. 

Mathematically, the R/S analysis is defined in the following way. Given an
interval of size $\tau$, whose left end is located at point $i_0$, we calculate
$\langle z \rangle_\tau$, the average of the series $z_i$ inside the interval,

$$ \langle z \rangle_\tau = \frac{1}{\tau} \sum_{i=i_0}^{i_0+\tau-1} z_i. \tag{2} $$

We then define an accumulated deviation from the mean as

$$ Z_i = \sum_{k=i_0}^{i} \left( z_k - \langle z \rangle_\tau \right), \tag{3} $$

from which we extract a range,

$$ R(\tau) = \max_{i_0 \le i \le i_0+\tau-1} Z_i - \min_{i_0 \le i \le i_0+\tau-1} Z_i, \tag{4} $$

and the corresponding standard deviation,

$$ S(\tau) = \sqrt{ \frac{1}{\tau} \sum_{i=i_0}^{i_0+\tau-1} \left( z_i - \langle z \rangle_\tau \right)^2 }. \tag{5} $$

Finally, we obtain the rescaled range $R(\tau)/S(\tau)$, and take its average over all
intervals.

Figure 2. Curves for a lack-of-fusion signal, obtained from (a) Hurst analysis,
(b) detrended-fluctuation analysis, (c) minimal-cover analysis, and (d) box-counting
analysis.
For a curve with true fractal features, the rescaled range should satisfy the
scaling form

$$ \frac{R(\tau)}{S(\tau)} \sim \tau^{H}, \tag{6} $$

where $H$ is the Hurst exponent.

A typical curve obtained from the R/S analysis of the signals is shown in Fig.
2(a).
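
The R/S procedure of Eqs. (2) to (5) can be sketched in a few lines. This is a minimal sketch, assuming consecutive non-overlapping intervals and discarding any leftover points at the end of the series (the text does not specify how incomplete intervals are handled); the name `rescaled_range` and the synthetic test signal are illustrative assumptions, not the code used in this work.

```python
import numpy as np

def rescaled_range(z, tau):
    """Average R/S over consecutive, non-overlapping intervals of tau points."""
    z = np.asarray(z, dtype=float)
    ratios = []
    for k in range(len(z) // tau):
        seg = z[k * tau:(k + 1) * tau]
        dev = seg - seg.mean()            # z_i - <z>_tau, with <z>_tau as in Eq. (2)
        Z = np.cumsum(dev)                # accumulated deviation, Eq. (3)
        R = Z.max() - Z.min()             # range, Eq. (4)
        S = np.sqrt(np.mean(dev ** 2))    # standard deviation, Eq. (5)
        if S > 0:                         # skip constant intervals (assumption)
            ratios.append(R / S)
    return float(np.mean(ratios)) if ratios else float("nan")

# Illustrative use on a synthetic 512-point signal (not data from this work).
rng = np.random.default_rng(0)
signal = rng.uniform(-1.0, 1.0, size=512)
taus = [4, 8, 16, 32, 64, 128]
curve = [rescaled_range(signal, tau) for tau in taus]
```

The list `curve`, evaluated on a chosen grid of interval sizes, plays the role of the R/S curve that is later fed to the classifier.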



2.2 Detrended-fluctuation analysis



The detrended-fluctuation analysis (DFA) [6] aims to improve the evaluation 
of correlations in a time series by eliminating trends in the data. 



The method consists initially in obtaining a new integrated series $\tilde{z}_i$,

$$ \tilde{z}_i = \sum_{k=1}^{i} \left( z_k - \langle z \rangle \right), \tag{7} $$

the average $\langle z \rangle$ being taken over all points,

$$ \langle z \rangle = \frac{1}{L} \sum_{i=1}^{L} z_i. \tag{8} $$



After dividing the series into intervals, the points inside a given interval are
fitted by a polynomial curve of degree $n$. In our case, we have considered $n = 1$
or $n = 2$, corresponding to first- and second-order fits. Then, a detrended
variation function $\Delta_{i,n}$ is obtained by subtracting from the integrated data
the local trend as given by the fit. Explicitly, we define

$$ \Delta_{i,n} = \tilde{z}_i - h_{i,n}, \tag{9} $$

where $\tilde{z}_i$ denotes the integrated series and $h_{i,n}$ is the value associated
with point $i$ according to the fit of degree $n$. Finally, we calculate the
root-mean-square fluctuation $F_n(\tau)$ inside an interval as

$$ F_n(\tau) = \sqrt{ \frac{1}{\tau} \sum_{i} \Delta_{i,n}^2 }, \tag{10} $$

the sum running over the points $i$ in the interval, and average over all intervals.
For a true fractal curve, $F_n(\tau)$ should behave as

$$ F_n(\tau) \sim \tau^{\alpha}, \tag{11} $$

where $\alpha$ is the scaling exponent.

A typical curve obtained from the detrended-fluctuation analysis of the signals
is shown in Fig. 2(b).
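
The detrended-fluctuation steps described above can be sketched as follows. This is a minimal sketch, assuming non-overlapping intervals and a least-squares polynomial fit via `numpy.polyfit`; the function name `dfa_fluctuation` and the parameter name `order` (the degree of the fit) are illustrative assumptions. The interval size should exceed `order + 1`, otherwise the fit passes through every point and the fluctuation is trivially zero.

```python
import numpy as np

def dfa_fluctuation(z, tau, order=1):
    """Average RMS fluctuation around a local polynomial trend of given order."""
    z = np.asarray(z, dtype=float)
    profile = np.cumsum(z - z.mean())             # integrated series, Eqs. (7)-(8)
    t = np.arange(tau)
    rms = []
    for k in range(len(z) // tau):
        seg = profile[k * tau:(k + 1) * tau]
        coeffs = np.polyfit(t, seg, order)         # local trend inside the interval
        delta = seg - np.polyval(coeffs, t)        # detrended variation, Eq. (9)
        rms.append(np.sqrt(np.mean(delta ** 2)))   # RMS fluctuation, Eq. (10)
    return float(np.mean(rms))

# Illustrative use on a synthetic 512-point signal (not data from this work).
rng = np.random.default_rng(1)
signal = rng.uniform(-1.0, 1.0, size=512)
curve_linear = [dfa_fluctuation(signal, tau, order=1) for tau in (8, 16, 32, 64)]
curve_quadratic = [dfa_fluctuation(signal, tau, order=2) for tau in (8, 16, 32, 64)]
```

Since a second-order fit can never leave a larger residual than a first-order fit on the same interval, the quadratic curve lies on or below the linear one at every interval size.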



2.3 Minimal-cover analysis 

This recently introduced method [7] relies on the calculation of the minimal 
area necessary to cover a given plane curve at a specified scale. 

After dividing the series, we can associate with each interval, labeled by a
variable $k$, a rectangle of height $H_k$, defined as the difference between the
maximum and minimum values of the series $z_i$ inside the $k$th interval,

$$ H_k = \max_{i_0 \le i \le i_0+\tau-1} z_i - \min_{i_0 \le i \le i_0+\tau-1} z_i, \tag{12} $$

in which $i_0 = 1 + (k-1)\tau$ labels the left end of the interval. The minimal
area is then given by

$$ A(\tau) = \sum_{k} H_k \tau, \tag{13} $$

the summation running over all cells.

Ideally, in the scaling region, $A(\tau)$ should behave as

$$ A(\tau) \sim \tau^{2-D_\mu}, \tag{14} $$

where $D_\mu$ is the minimal-cover dimension, which is equal to 1 when the signal
presents no fractality.

A typical curve obtained from the minimal-cover analysis of a signal is shown
in Fig. 2(c).
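
The minimal-cover area of Eqs. (12) and (13) reduces to a sum of interval heights. The following is a minimal sketch, assuming non-overlapping intervals and dropping any incomplete interval at the end of the series; the function name `minimal_cover_area` is an illustrative assumption.

```python
import numpy as np

def minimal_cover_area(z, tau):
    """Area of the cover built from one rectangle of width tau per interval."""
    z = np.asarray(z, dtype=float)
    n = len(z) // tau
    segments = z[:n * tau].reshape(n, tau)                    # one row per interval
    heights = segments.max(axis=1) - segments.min(axis=1)     # H_k, Eq. (12)
    return float(tau * heights.sum())                         # A(tau), Eq. (13)

# Small worked example: heights are (1, 3) for tau = 4 and (1, 1, 3, 3) for tau = 2.
example = [0, 1, 0, 1, 0, 3, 0, 3]
area_tau4 = minimal_cover_area(example, 4)
area_tau2 = minimal_cover_area(example, 2)
```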
2.4 Box-counting analysis



This is a well-known method of estimating the fractal dimension of a point set
[8], and it works by counting the minimum number $N(\tau)$ of boxes of side $\tau$
needed to cover all points in the set. For a real fractal, $N(\tau)$ should follow a
power law whose exponent is the box-counting dimension $D_B$,

$$ N(\tau) \sim \tau^{-D_B}. \tag{15} $$

A typical box-counting curve for a signal is shown in Fig. 2(d).
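
A simple way to approximate the box count for a sampled signal is to lay a fixed grid over the points (i, z_i) and count the occupied cells. The sketch below does this under two assumptions that are not stated in the text: the amplitudes are rescaled to span the same range as the time axis, and a fixed grid is used in place of a true minimal cover. The function name `box_count` is hypothetical.

```python
import numpy as np

def box_count(z, tau):
    """Number of grid boxes of side tau occupied by the points (i, z_i)."""
    z = np.asarray(z, dtype=float)
    L = len(z)
    t = np.arange(L)
    span = z.max() - z.min()
    # Rescale amplitudes to the interval [0, L - 1] (assumption, see lead-in).
    y = (z - z.min()) * (L - 1) / span if span > 0 else np.zeros(L)
    cols = (t // tau).tolist()
    rows = np.floor(y / tau).astype(int).tolist()
    return len(set(zip(cols, rows)))

# A straight diagonal line of 16 points occupies one box per column of the grid.
n_diagonal = box_count(np.arange(16), 4)
n_flat = box_count(np.zeros(16), 4)
```

A different choice of amplitude scaling shifts the curve and can change where it saturates, so the scaling convention has to be kept fixed across all signals being compared.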



3 Results of the classification approach 

In order to classify the signals, we used a supervised variation of the Karhunen-Loeve
(KL) transformation [13,14], briefly described in Appendix A. For each
signal, we collected the corresponding curves from various fractal analyses,
forming a single vector with $M$ components. The most successful combination
involves curves from Hurst, linear detrended-fluctuation, minimal-cover, and
box-counting analyses, corresponding to $M = 108$ (with 27 components of
the vector taken from each curve). A plot obtained by projecting the first
two components of the KL-transformed vectors is shown in Fig. 3, for the
full set of vectors. (Note that, with 4 different classes for the vectors, the
transformed space is three-dimensional.) It is evident from the figure that the
transformation yields a good clustering of the vectors around the different
class means. This clustering is a general feature of the KL transformation.
However, to assess the utility of the classification approach, it is essential to
evaluate the generalization error.

We proceeded by first randomly dividing the vectors into a training set (with
80% of the signals) and a test set (with the remaining signals). The KL
transformation was first applied to the training vectors, and the class means were
determined. Transformed vectors in both sets were then classified by applying
the nearest-class-mean rule, i.e., a vector x was assigned to the class whose
average vector, as determined by the training set, lies closest to x. (It is also
possible to explore other approaches for discrimination, such as Bayesian rules,
but that would require an estimation of the class-conditional probabilities,
which we do not have at hand.) Finally, we took averages over 500 different
choices of training and test sets.
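
The evaluation protocol just described (repeated random 80/20 splits followed by the nearest-class-mean rule) can be sketched as follows. This is a minimal sketch under stated assumptions: vectors are stored as rows, the splits are not stratified by class, and the KL transformation that would be fitted on each training set is omitted for brevity. The function names and the toy data are illustrative only.

```python
import numpy as np

def nearest_class_mean(train_X, train_y, X):
    """Assign each row of X to the class whose training mean is closest."""
    classes = np.unique(train_y)
    means = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every vector to every class mean.
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]

def mean_test_error(X, y, n_splits=500, train_fraction=0.8, seed=0):
    """Average test error over random train/test splits."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    rng = np.random.default_rng(seed)
    n_train = int(round(train_fraction * len(y)))
    errors = []
    for _ in range(n_splits):
        perm = rng.permutation(len(y))
        train, test = perm[:n_train], perm[n_train:]
        predicted = nearest_class_mean(X[train], y[train], X[test])
        errors.append(np.mean(predicted != y[test]))
    return float(np.mean(errors))

# Toy data: two well-separated clusters (not data from this work).
toy_X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
toy_y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
toy_error = mean_test_error(toy_X, toy_y, n_splits=20)
```

Fitting any transformation on the training vectors only, and never on the test vectors, is what makes the averaged test error an estimate of the generalization error.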

The average confusion matrices of the training and test sets are shown in
Tables 1 and 2. Notice that the mean error rate is negligible for the training
vectors, and corresponds to around 15% for the test vectors. These error rates
are comparable to those obtained by analyzing the same signals directly using
non-linear classifiers based on neural networks [4]; the use of linear classifiers,
on the other hand, leads to considerably higher error rates [3]. Notice that in
our study the number of variables (108) employed in the classification step
represents only around 1/5 of the number used in the neural-network studies
(which made use of all 512 points of each signal). Besides rendering the
calculations faster, for an equivalent error rate, the smaller number of variables
also leads to smaller fluctuations in the curves.

Figure 3. Projection, along the first two components, of the vectors obtained by
applying the Karhunen-Loeve transformation to the full data set obtained
from four different fractal analyses.

Table 1

Average confusion matrix for the training vectors, derived from the fractal analyses.
The possible classes are lack of fusion (LF), lack of penetration (LP), porosity (PO)
and no defects (ND). The figures in parentheses indicate the standard deviations,
calculated over 500 sets. The value in row i, column j indicates the percentage of
vectors belonging to class i which were associated with class j.

      LF      LP              PO              ND
LF    100     -               -               -
LP    -       99.87 (0.50)    0.13 (0.50)     -
PO    -       0.01 (0.10)     99.99 (0.10)    -
ND    -       -               0.01 (0.08)     99.99 (0.08)


For completeness, we also applied the KL transformation to both the correlograms
and the Fourier spectra of each signal, obtaining average error rates no
smaller than 36% and 48%, respectively (see Table 3).
Table 2

The same as in Table 1 for the testing vectors.

      LF              LP               PO               ND
LF    91.07 (8.20)    1.69 (3.64)      6.88 (7.37)      0.35 (1.69)
LP    2.61 (8.20)     83.96 (10.04)    12.14 (9.27)     1.28 (3.21)
PO    6.43 (7.27)     13.99 (10.5)     72.66 (12.87)    6.92 (7.55)
ND    1.01 (3.25)     2.55 (4.43)      6.92 (7.16)      89.51 (8.93)


Table 3

Average percentage rates of correct classification of the test vectors, derived from
applying the KL transformation to the correlograms or the Fourier spectra associated
with the signals. The figures in parentheses indicate standard deviations calculated
over 100 sets.

                   LF               LP               PO               ND
Correlograms       60.19 (17.75)    64.18 (14.96)    51.60 (16.87)    57.43 (16.16)
Fourier spectra    51.42 (17.49)    47.35 (17.03)    46.76 (16.64)    48.08 (15.76)



4 Conclusions 



In this paper we applied techniques developed for the study of fractal properties
of time series as a preprocessing tool for the classification of defects probed
by ultrasonic signals. The signals were obtained in welding joints containing
three different classes of defects, and we also considered signals with no defects.
For the classification step, we employed an extension of the Karhunen-Loeve
transformation, which, supplemented by the nearest-class-mean rule,
yielded low error rates (negligible in the training stage and around 15% in the test
stage). These error rates are comparable with those obtained from more
time-consuming approaches based on direct analysis of the signals. In our view, this
is evidence that fractal techniques are a promising tool for the classification
of defects probed by ultrasonic inspection.

We believe that the performance of the classification approach based on fractal 
techniques can be further improved by resorting to non-linear classifiers, espe- 
cially in combination with reclassification and hierarchical procedures. These 
extensions we leave for future investigations. 
A Karhunen-Loeve transformation



The Karhunen-Loeve (KL) transformation, like principal component analysis,
is a tool for feature selection and extraction. It produces a set of mutually
uncorrelated components, and dimensionality reduction can be achieved
by selecting those components with the largest variances. The version of the
transformation employed here [Kittler and Young, 1973 (Webb)] relies on
compression of the discriminatory information contained in the class means.

Let $\mathbf{x}_i$ be the (column) vector corresponding to the $i$th signal. The KL
transformation consists of first projecting the training vectors along the eigenvectors
of the within-class covariance matrix $\mathbf{S}_W$, defined by

$$ \mathbf{S}_W = \frac{1}{N} \sum_{k=1}^{N_c} \sum_{i=1}^{N} y_{ik} (\mathbf{x}_i - \mathbf{m}_k)(\mathbf{x}_i - \mathbf{m}_k)^T, \tag{A.1} $$

where $N$ is the total number of training vectors, $N_c$ is the number of different
classes, $N_k$ is the number of vectors in class $k$, $\mathbf{m}_k$ is the average vector
of class $k$, and $T$ denotes the transpose of a matrix (in this case, yielding a row
vector). The element $y_{ik}$ is equal to one if $\mathbf{x}_i$ belongs to class $k$, and
zero otherwise. We also rescale the resulting vectors by a diagonal matrix built from
the eigenvalues $\lambda_j$ of $\mathbf{S}_W$. In matrix notation,
this operation can be written as

$$ \mathbf{X}' = \mathbf{\Lambda}^{-1/2} \mathbf{U}^T \mathbf{X}, \tag{A.2} $$



in which $\mathbf{X}$ is the matrix whose columns are the training vectors $\mathbf{x}_i$,
$\mathbf{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots)$, and $\mathbf{U}$ is the
matrix whose columns are the eigenvectors of $\mathbf{S}_W$. This choice of coordinates
ensures that the transformed within-class covariance matrix corresponds to the unit
matrix. Finally, in order to compress the class information, we project the resulting
vectors onto the eigenvectors of the between-class covariance matrix $\mathbf{S}_B$,

$$ \mathbf{S}_B = \sum_{k=1}^{N_c} \frac{N_k}{N} (\mathbf{m}_k - \mathbf{m})(\mathbf{m}_k - \mathbf{m})^T, \tag{A.3} $$

where $\mathbf{m}$ is the overall average vector. The full transformation can be written
as

$$ \mathbf{X}'' = \mathbf{V}^T \mathbf{\Lambda}^{-1/2} \mathbf{U}^T \mathbf{X}, \tag{A.4} $$

$\mathbf{V}$ being the matrix whose columns are the eigenvectors of $\mathbf{S}_B$
(calculated from $\mathbf{X}'$).



With $N_c$ possible classes, the fully-transformed vectors have at most $N_c - 1$
relevant components. We then associate a vector $\mathbf{x}_i$ with the class whose
average vector lies closest to $\mathbf{x}_i$ within the transformed $(N_c - 1)$-dimensional
space. This association rule would be optimal if the vectors in different classes
followed normal distributions.
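
The two-stage transformation of this appendix (whitening by the within-class covariance, followed by projection onto the leading eigenvectors of the between-class covariance) can be sketched with standard eigendecompositions. This is a minimal sketch, not the implementation used in this work: the function name `fit_kl`, the tolerance `tol` used to discard near-zero eigenvalues, and the synthetic data are assumptions introduced for illustration. As in the text, vectors are stored as columns.

```python
import numpy as np

def fit_kl(X, y, tol=1e-10):
    """Return W such that W @ X gives the (N_c - 1)-dimensional transformed vectors."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    M, N = X.shape
    m = X.mean(axis=1, keepdims=True)               # overall average vector
    means, counts = [], []
    Sw = np.zeros((M, M))
    for c in classes:
        Xc = X[:, y == c]
        mc = Xc.mean(axis=1, keepdims=True)         # class average vector
        Sw += (Xc - mc) @ (Xc - mc).T               # within-class scatter of class c
        means.append(mc)
        counts.append(Xc.shape[1])
    Sw /= N                                         # within-class covariance matrix
    lam, U = np.linalg.eigh(Sw)
    keep = lam > tol * lam.max()                    # drop near-singular directions (assumption)
    A = (U[:, keep] / np.sqrt(lam[keep])).T         # Lambda^{-1/2} U^T
    Sb = np.zeros((A.shape[0], A.shape[0]))
    for mc, nk in zip(means, counts):
        d = A @ (mc - m)                            # whitened class-mean deviation
        Sb += (nk / N) * (d @ d.T)                  # between-class covariance matrix
    mu, V = np.linalg.eigh(Sb)
    lead = np.argsort(mu)[::-1][:len(classes) - 1]  # N_c - 1 leading eigenvectors
    return V[:, lead].T @ A                         # V^T Lambda^{-1/2} U^T

# Synthetic example: three classes in five dimensions (not data from this work).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)
data = rng.normal(size=(5, 60))
data[:2] += np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])[labels].T
W = fit_kl(data, labels)
transformed = W @ data
```

Classification then proceeds by applying the nearest-class-mean rule to the columns of `transformed`. By construction, the within-class covariance of the transformed training vectors equals the unit matrix.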